{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook was prepared by [Donne Martin](http://donnemartin.com). Source and license info is on [GitHub](https://github.com/donnemartin/data-science-ipython-notebooks)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Linux Commands\n",
    "\n",
    "* Disk Usage\n",
    "* Splitting Files\n",
    "* Grep, Sed\n",
    "* Compression\n",
    "* Curl\n",
    "* View Running Processes\n",
    "* Terminal Syntax Highlighting\n",
    "* Vim"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Disk Usage"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Display human-readable (-h) free disk space:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!df -h"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Display human-readable (-h) disk usage statistics:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!du -h ./"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Display human-readable (-h) disk usage statistics, showing only the total usage (-s):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!du -sh ../"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Display the human-readable (-h) disk usage statistics, showing also the grand total for all file types (-c):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!du -csh ./"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Splitting Files"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Count number of lines in a file with wc:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!wc -l < file.txt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Count the number of lines in a file with grep:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!grep -c \".\" file.txt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Split a file into multiple files based on line count:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!split -l 20 file.txt new"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Split a file into multiple files based on line count, use suffix of length 1:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!split -l 802 -a 1 file.csv dir/part-user-csv.tbl-"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Grep, Sed"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "List number of files matching “.txt\":"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!ls -1 | grep .txt | wc -l"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Check number of MapReduce records processed, outputting the results to the terminal:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!cat * | grep -c \"foo\" folder/part*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Delete matching lines in place:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!sed -i '/Important Lines: /d’ original_file"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Compression"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Compress zip\n",
    "!zip -r archive_name.zip folder_to_compress\n",
    "\n",
    "# Compress zip without invisible Mac resources\n",
    "!zip -r -X archive_name.zip folder_to_compress\n",
    "\n",
    "# Extract zip\n",
    "!unzip archive_name.zip\n",
    "\n",
    "# Compress TAR.GZ\n",
    "!tar -zcvf archive_name.tar.gz folder_to_compress\n",
    "\n",
    "# Extract TAR.GZ\n",
    "!tar -zxvf archive_name.tar.gz\n",
    "\n",
    "# Compress TAR.BZ2\n",
    "!tar -jcvf archive_name.tar.bz2 folder_to_compress\n",
    "\n",
    "# Extract TAR.BZ2\n",
    "!tar -jxvf archive_name.tar.bz2\n",
    "\n",
    "# Extract GZ\n",
    "!gunzip archivename.gz\n",
    "\n",
    "# Uncompress all tar.gz in current directory to another directory\n",
    "!for i in *.tar.gz; do echo working on $i; tar xvzf $i -C directory/ ; done"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Curl"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Display the curl output:\n",
    "!curl donnemartin.com\n",
    "\n",
    "# Download the curl output to a file:\n",
    "!curl donnemartin.com > donnemartin.html\n",
    "\n",
    "# Download the curl output to a file -o\n",
    "!curl -o image.png http://i1.wp.com/donnemartin.com/wp-content/uploads/2015/02/splunk_cover.png\n",
    "\n",
    "# Download the curl output to a file, keeping the original file name -O\n",
    "!curl -O http://i1.wp.com/donnemartin.com/wp-content/uploads/2015/02/splunk_cover.png\n",
    "    \n",
    "# Download multiple files, attempting to reuse the same connection\n",
    "!curl -O url1 -O url2\n",
    "\n",
    "# Follow redirects -L\n",
    "!curl -L url\n",
    "\n",
    "# Resume a previous download -C -\n",
    "!curl -C - -O url\n",
    "\n",
    "# Authenticate -u\n",
    "!curl -u username:password url"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## View Running Processes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Display sorted info about processes\n",
    "!top\n",
    "\n",
    "# Display all running processes\n",
    "!ps aux | less\n",
    "\n",
    "# Display all matching running processes with full formatting\n",
    "!ps -ef | grep python\n",
    "\n",
    "# See processes run by user dmartin\n",
    "!ps -u dmartin\n",
    "\n",
    "# Display running processes as a tree\n",
    "!pstree"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Terminal Syntax Highlighting"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Add the following to your ~/.bash_profile:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "export PS1='\\[\\033[01;32m\\]\\u@\\h\\[\\033[00m\\]:\\[\\033[01;34m\\]\\W\\[\\033[00m\\]\\$ '\n",
    "export CLICOLOR=1\n",
    "export LSCOLORS=ExFxBxDxCxegedabagacad\n",
    "alias ls='ls -GFh'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Reload .bash_profile:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!source ~/.bash_profile"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Vim"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "Normal mode:  esc\n",
    "\n",
    "Basic movement:  h, j, k, l\n",
    "Word movement:  w, W, e, E, b, B\n",
    "\n",
    "Go to matching parenthesis:  %\n",
    "Go to start of the line:  0\n",
    "Go to end of the line:  $\n",
    "\n",
    "Find character:  f\n",
    "\n",
    "Insert mode:  i\n",
    "Append to line:  A\n",
    "\n",
    "Delete character:  x\n",
    "Delete command:  d\n",
    "Delete line:  dd\n",
    "\n",
    "Replace command:  r\n",
    "Change command:  c\n",
    "\n",
    "Undo:  u (U for all changes on a line)\n",
    "Redo:  CTRL-R\n",
    "\n",
    "Copy the current line:  yy\n",
    "Paste the current line:  p (P for paste above cursor)\n",
    "\n",
    "Quit without saving changes:  q!\n",
    "Write the current file and quit:  :wq"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Run the following command to enable the tutorial:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!vimtutor"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Run the following commands to enable syntax colors:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!cd ~\n",
    "!vim .vimrc\n",
    "# Add the following to ~/.vimrc\n",
    "syntax on\n",
    ":wq"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
